Spam Filtering in Twitter Using Sender-Receiver Relationship
Abstract
Twitter is one of the most visited sites today, and Twitter spam is constantly increasing. Because Twitter spam differs from traditional spam such as email and blog spam, conventional spam filtering methods are ill-suited to detecting it. Many researchers have therefore proposed schemes to detect spammers on Twitter. These schemes rely on features of spam accounts, such as content similarity, account age, and the ratio of URLs. However, using account features to detect spam raises two significant problems. First, spammers can easily fabricate account features. Second, account features cannot be collected until a spammer has already carried out a number of malicious activities, so a spammer is detected only after sending a number of spam messages. In this paper, we propose a novel spam filtering system that detects spam messages on Twitter. Instead of account features, we use relation features, such as the distance and connectivity between a message sender and a message receiver, to decide whether the current message is spam. Unlike account features, relation features are difficult for spammers to manipulate and can be collected immediately. We collected a large number of spam and non-spam Twitter messages, and then built and compared several classifiers. Our analysis shows that most spam comes from accounts that have a weak relationship with the receiver. We also show that our scheme is better suited to detecting Twitter spam than previous schemes.
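To make the idea of relation features concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not define distance or connectivity precisely, so the sketch assumes that distance is the shortest directed path length from sender to receiver in a follow graph, and that connectivity can be approximated by the Jaccard overlap of the two accounts' followed sets. The function names, the `max_depth` cutoff, and the toy graph are all hypothetical.

```python
from collections import deque


def distance(graph, sender, receiver, max_depth=4):
    """Shortest directed path length from sender to receiver.

    Assumption: graph maps each account to the accounts it follows.
    Returns None if receiver is not reachable within max_depth hops.
    """
    if sender == receiver:
        return 0
    seen = {sender}
    queue = deque([(sender, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the hop limit
        for nxt in graph.get(node, ()):
            if nxt == receiver:
                return depth + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return None


def connectivity(graph, sender, receiver):
    """Jaccard overlap of followed sets (an assumed proxy for connectivity)."""
    a = set(graph.get(sender, ()))
    b = set(graph.get(receiver, ()))
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def relation_features(graph, sender, receiver):
    """Bundle both features for one (sender, receiver) message pair."""
    return {
        "distance": distance(graph, sender, receiver),
        "connectivity": connectivity(graph, sender, receiver),
    }


# Hypothetical toy follow graph, used only for illustration.
follows = {
    "alice": ["bob", "carol"],
    "bob": ["carol", "dave"],
    "carol": ["dave"],
    "dave": [],
    "spammer": ["x1", "x2"],
}

print(relation_features(follows, "alice", "dave"))
print(relation_features(follows, "spammer", "dave"))
```

In this toy graph the pair (alice, dave) is two hops apart, while (spammer, dave) has no path and no shared neighbors, which matches the abstract's observation that spam tends to come from accounts weakly related to the receiver. Feature vectors of this kind could then be passed to any standard classifier; which classifiers the authors compared is not stated in the abstract.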
Similar resources
Network-based spam filter on Twitter
Rapidly growing micro-blogging social networks, such as Twitter, have been infiltrated by a large number of spam accounts. Limited to 140 characters, Twitter spam is often so different from traditional email spam and link spam that conventional content-based spam filtering methods are insufficient. Many researchers have proposed schemes to detect spammers on Twitter. Most of these sch...
Sender and Receiver Addresses as Cues for Anti-Spam Filtering
This study analysed the sender and receiver addresses of 3,417 unsolicited e-mails. Over 60.3% of the unsolicited e-mails were found to have an invalid sender address, and 92.8% of receiver addresses did not appear in the “To” or “CC” headers. The analytical results indicated that e-mail addresses in the header could provide a cue for filtering junk e-mails.
Minimizing the Time of Spam Mail Detection by Relocating Filtering System to the Sender Mail Server
Unsolicited Bulk Emails (also known as spam) are undesirable emails sent to a massive number of users. Spam emails consume network resources and cause many security uncertainties. As we studied, the location where the spam filter operates is an important parameter for preserving network resources. Although there are many different methods to block spam emails, most of program developers on...
An Efficient Spam Mail Detection by Counter Technique
Spam mails are unwanted mails sent to a large number of users. Spam mails not only consume network resources but also cause security threats. This paper proposes an efficient technique to detect and prevent spam mail on the sender side rather than the receiver side. This technique is based on a counter set on the sender server. When a mail is transmitted to the server, the mail server...
Evaluation of Anti-spam Method Combining Bayesian Filtering and Strong Challenge and Response
Recently, various anti-spam schemes have been proposed because of the rapid increase in spam. Some schemes are based on sender whitelisting with auto registration, a principle under which a recipient reads only messages from senders who are registered by the recipient, and a sender has to perform some procedure to be registered (challenge-response). In these schemes, some exceptions are required to show e...